I Want To Give My Models A Delete Token So They Can Undo Their Mistakes
I have an idea. It is simple. It is probably stupid. I want to give my AI models a delete token. A special token they can output to signal they want to undo their previous output and try again. Like a backspace key for autoregressive generation. Like ctrl-z for language models.
When your model outputs nonsense and you wish it could just take it back, you start imagining tokens that delete tokens. This is where bad ideas begin. This is also where I live.
The Core Concept
Here is how it would work. The model generates tokens normally. At any point, if it detects it is heading in the wrong direction, it outputs a special [DELETE] token. This token signals the inference engine to remove the previous token from the context. The model then generates a replacement. It can do this multiple times. It can backtrack. It can self-correct. It can fix its mistakes in real time.
context = list(initial_prompt)  # token ids for the prompt

prompt_len = len(context)
while not done:
    token = model.generate(context)
    if token == DELETE_TOKEN:  # id of the [DELETE] special token
        if len(context) > prompt_len:  # never delete into the prompt itself
            context.pop()  # remove the last generated token
        continue
    context.append(token)
# Simple. Elegant. Probably breaks everything.
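Before that loop can run, [DELETE] has to exist in the vocabulary. A minimal setup sketch, assuming a Hugging Face style tokenizer and model (the gpt2 checkpoint is just a placeholder, not what I actually train):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Register [DELETE] as a new special token and grow the embedding
# matrix so the model gets a (randomly initialized) row for it.
tokenizer.add_special_tokens({"additional_special_tokens": ["[DELETE]"]})
model.resize_token_embeddings(len(tokenizer))

DELETE_TOKEN = tokenizer.convert_tokens_to_ids("[DELETE]")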
The model learns when to delete during training. It sees examples where backtracking leads to better outputs. It learns to recognize its own mistakes. It learns to correct itself before the mistake propagates. The delete token becomes a tool for self-improvement. Or it becomes a crutch for indecisive generation. Both outcomes are possible.
Why This Might Help
Small models make mistakes. My one million parameter models output confident nonsense. They commit to wrong answers early. They cannot revise. They cannot reconsider. They just keep generating until they hit the token limit. A delete token would give them a chance to course-correct. To admit error. To try again.
This aligns with how humans write. We draft. We revise. We delete. We rewrite. Autoregressive generation lacks this loop. It is linear. It is irreversible. It is unforgiving. A delete token reintroduces revision. It makes generation iterative. It makes mistakes recoverable.
Perfection is not the absence of mistakes. It is the ability to fix them. My models cannot fix mistakes. A delete token might change that. Or it might just make them delete everything and give up. Both are educational.
Training The Delete Behavior
The hard part is teaching the model when to delete. I would need training data that includes backtracking. Examples where the model outputs a wrong token, then deletes it, then outputs a correct replacement. I could synthesize this data. I could take correct sequences, inject errors, then show the correction. I could train the model to recognize its own uncertainty and trigger deletion accordingly.
Prompt: "Water is made of"
Model output: "resins [DELETE] H2O"
Target: "H2O"
Loss: Penalize "resins", reward "H2O", encourage [DELETE] as correction
# The model learns that deleting wrong answers is good.
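The synthesis itself is mostly token surgery. A rough sketch, with names that are mine rather than anything that exists: walk a clean sequence and occasionally splice in a wrong token followed by [DELETE].

import random

def inject_backtracking(tokens, vocab_size, delete_id, p=0.1):
    """Turn a clean token sequence into a training target that
    demonstrates delete-and-retry behavior."""
    out = []
    for tok in tokens:
        if random.random() < p:
            wrong = random.randrange(vocab_size)
            if wrong not in (tok, delete_id):
                # Mistake, then the correction signal, then the right answer.
                out.extend([wrong, delete_id])
        out.append(tok)
    return out

Training on these targets with plain cross-entropy teaches the model the surface pattern. Whether it also learns to fire [DELETE] on its own genuine mistakes, rather than only on my injected ones, is the open question.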
I could also use reinforcement learning. Reward sequences that use delete tokens to reach better final outputs. Penalize excessive deletion. Balance correction with efficiency. This is feasible. This is also a lot of work for a feature that might just make my models more hesitant.
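For the RL route, the reward has to be computed on the resolved sequence, the one left after all the deletes are applied, with a small tax per deletion. A sketch, where the quality score is whatever metric I end up trusting:

def resolve_deletes(tokens, delete_id):
    """Apply [DELETE] tokens to recover the final visible sequence."""
    out = []
    for tok in tokens:
        if tok == delete_id:
            if out:
                out.pop()
        else:
            out.append(tok)
    return out

def shaped_reward(quality_score, num_deletes, lam=0.05):
    # Reward the quality of the final output, charge a little per
    # deletion so correction pays off but backspace spam does not.
    return quality_score - lam * num_deletes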
Potential Problems
The delete token could be abused. The model could delete everything and output nothing. It could enter infinite delete loops. It could become overly cautious and never commit to any token. It could learn that deletion is safer than generation and just backspace forever. These are real risks. These are also funny to imagine.
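The runaway cases are at least easy to guard against at inference time. A sketch: cap consecutive deletes, and once the cap is hit, mask [DELETE] out of the logits so the model has to commit (model.logits and sample are hypothetical helpers, not real APIs):

MAX_CONSECUTIVE_DELETES = 3

consecutive_deletes = 0
while not done:
    logits = model.logits(context)  # hypothetical: next-token logits
    if consecutive_deletes >= MAX_CONSECUTIVE_DELETES or len(context) <= prompt_len:
        logits[DELETE_TOKEN] = float("-inf")  # deletion is off the table
    token = sample(logits)  # hypothetical sampler
    if token == DELETE_TOKEN:
        consecutive_deletes += 1
        context.pop()
    else:
        consecutive_deletes = 0
        context.append(token)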
There is also the inference complexity. Deleting tokens means managing the context window dynamically: tracking which tokens are still active and rolling back any cached state along with them. It requires careful implementation to avoid bugs. I am the person who implements things. I am also the person who introduces bugs. The risk is high. The reward is uncertain.
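The concrete wrinkle in a transformer stack is the KV cache: deleting a token means rolling the cache back too, or the model keeps attending to a token that no longer exists. A minimal sketch, assuming the cache is a per-layer list of (key, value) tensors shaped (batch, heads, seq_len, head_dim):

def rollback_kv_cache(past_key_values, n=1):
    """Drop the last n positions from every layer's key/value tensors."""
    return [(k[:, :, :-n, :], v[:, :, :-n, :]) for k, v in past_key_values]

Position ids need the same treatment, and any engine with paged or batched caches will make this messier. This is exactly where I expect to introduce the bugs.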
Why I Am Thinking About This
I watch my models generate wrong answers with confidence. I watch them commit to errors early. I watch them continue down bad paths because they cannot revise. I wish they could just take it back. I wish they could learn from their mistakes in real time. A delete token is my attempt to give them that ability.
This is not about making models perfect. This is about making mistakes recoverable. This is about adding a layer of self-correction to autoregressive generation. This is about acknowledging that small models will be wrong sometimes and giving them a tool to fix it.
Final Thoughts
I want to give my models a delete token. So they can undo mistakes. So they can self-correct. So they can learn from errors in real time. The idea is simple. The implementation is hard. The outcome is uncertain.
I will probably try it. I will probably break something. I will probably learn something. That is the cycle. That is the process. That is how progress happens in my tiny corner of AI research.
If you have thoughts on this idea, let me know. If you have tried something similar, share your results. If you think this is a terrible idea, you are probably right. But terrible ideas sometimes lead to good ones. That is how innovation works. That is how I work.